The present invention relates to the field of Internet technology.
Specifically, the present invention relates to the creation and management of
custom World Wide Web sites.



DESCRIPTION OF RELATED ART 

The World Wide Web (the Web) represents all of the computers on
the Internet that offer users access to information on the Internet via
interactive documents or Web pages. These Web pages contain hypertext
links that are used to connect any combination of graphics, audio, video and
text, in a non-linear, non-sequential manner. Hypertext links are created
using a special software language known as HyperText Mark-Up Language
(HTML).



Once created, Web pages reside on the Web, on Web servers or Web
sites. A Web site can contain numerous Web pages. Web client machines
running Web browsers can access these Web pages at Web sites via a
communications protocol known as HyperText Transfer Protocol (HTTP).
Web browsers are software interfaces that run on World Wide Web clients to
allow access to Web sites via a simple user interface. A Web browser allows a
Web client to request a particular Web page from a Web site by specifying a
Uniform Resource Locator (URL). A URL is a Web address that identifies the
Web page and its location on the Web. When the appropriate Web site
receives the URL, the Web page corresponding to the requested URL is
located, and if required, HTML output is generated. The HTML output is
then sent via HTTP to the client for formatting on the client's screen.

Although Web pages and Web sites are extremely simple to create, the 
proliferation of Web sites on the Internet highlighted a number of problems. 
The scope and ability of a Web page designer to change the content of the
Web page was limited by the static nature of Web pages. Once created, a Web 
page remained static until it was manually modified. This in turn limited 
the ability of Web site managers to effectively manage their Web sites. 

The Common Gateway Interface (CGI) standard was developed to
resolve the problem of allowing dynamic content to be included in Web
pages. CGI "calls" or procedures enable applications to generate dynamically
created HTML output, thus creating Web pages with dynamic content. Once
created, these CGI applications do not have to be modified in order to
retrieve "new" or dynamic data. Instead, when the Web page is invoked,
CGI "calls" or procedures are used to dynamically retrieve the necessary data
and to generate a Web page.
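By way of illustration only, the following Python sketch shows the general
idea of a CGI application: the HTML output is built when the page is invoked
rather than stored as a static file. The names generate_page and fetch_quotes,
and the sample data, are hypothetical and are not taken from any particular
CGI tool.

```python
from datetime import datetime, timezone
from html import escape


def fetch_quotes():
    """Hypothetical data lookup; a real CGI program might query a database."""
    return [("ACME", 12.5), ("INIT", 40.0)]


def generate_page(rows, now=None):
    """Build a complete CGI response: header, blank line, then HTML body."""
    now = now or datetime.now(timezone.utc)
    items = "".join(
        f"<li>{escape(name)}: {price:.2f}</li>" for name, price in rows
    )
    body = (
        "<html><body>"
        f"<p>Generated at {now.isoformat()}</p>"
        f"<ul>{items}</ul>"
        "</body></html>"
    )
    # A CGI program writes its headers, a blank line, and then the document.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    print(generate_page(fetch_quotes()))
```

Because the data is fetched on each invocation, the program itself does not
have to change when the underlying data changes.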

CGI applications also enhanced the ability of Web site administrators
to manage Web sites. Administrators no longer have to constantly update
static Web pages. A number of vendors have developed tools for CGI-based
development, to address the issue of dynamic Web page generation.
Companies like Spider™ and Bluestone™, for example, have each created
development tools for CGI-based Web page development. Another
company, Haht Software™, has developed a Web page generation tool that
uses a BASIC-like scripting language, instead of a CGI scripting language.

Tools that generate CGI applications do not, however, resolve the
problem of managing numerous Web pages and requests at a Web site. For
example, a single company may maintain hundreds of Web pages at its
Web site. Current Web server architecture also does not allow the Web
server to efficiently manage the Web pages and process Web client requests.
Managing these hundreds of Web pages in a coherent manner and
processing all requests for access to the Web pages is thus a difficult task.
Existing development tools are limited in their capabilities to facilitate
dynamic Web page generation, and do not address the issue of managing
Web requests or Web sites.

SUMMARY OF THE INVENTION

It is therefore an object of the present invention to provide a method 
and apparatus for creating and managing custom Web sites. Specifically, the 
present invention claims a method and apparatus for managing dynamic 
Web page generation requests.

In one embodiment, the present invention claims a computer- 
implemented method for managing a dynamic Web page generation request 
to a Web server, the computer-implemented method comprising the steps of 
routing the request from the Web server to a page server, the page server 
receiving the request and releasing the Web server to process other requests, 
processing the request, the processing being performed by the page server 
concurrently with the Web server, as the Web server processes the other 
requests, and dynamically generating a Web page in response to the request, 
the Web page including data dynamically retrieved from one or more data 
sources. Other embodiments also include connection caches to the one or 
more data sources, page caches for each page server, and custom HTML 
extension templates for configuring the Web page. 

Other objects, features and advantages of the present invention will be
apparent from the accompanying drawings and from the detailed
description.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 illustrates a typical computer system in which the present
invention operates.

Figure 2 illustrates a typical prior art Web server environment.

Figure 3 illustrates a typical prior art Web server environment in the
form of a flow diagram.

Figure 4 illustrates one embodiment of the presently claimed
invention.

Figure 5 illustrates the processing of a Web browser request in the
form of a flow diagram, according to one embodiment of the presently
claimed invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

The present invention relates to a method and apparatus for
creating and managing custom Web sites. In the following detailed
description, numerous specific details are set forth in order to provide a
thorough understanding of the present invention. It will be apparent to one
of ordinary skill in the art, however, that these specific details need not be
used to practice the present invention. In other instances, well-known
structures, interfaces and processes have not been shown in detail in order
not to unnecessarily obscure the present invention.

Figure 1 illustrates a typical computer system 100 in which the present 
invention operates. The preferred embodiment of the present invention is 
implemented on an IBM™ Personal Computer manufactured by IBM 
Corporation of Armonk, New York. An alternate embodiment may be
implemented on an RS/6000™ Workstation manufactured by IBM 
Corporation of Armonk, New York. It will be apparent to those of ordinary 
skill in the art that other computer system architectures may also be 
employed. 


In general, such computer systems as illustrated by Figure 1 comprise a
bus 101 for communicating information, a processor 102 coupled with the
bus 101 for processing information, main memory 103 coupled with the bus
101 for storing information and instructions for the processor 102, a
read-only memory 104 coupled with the bus 101 for storing static information
and instructions for the processor 102, a display device 105 coupled with the
bus 101 for displaying information for a computer user, an input device 106
coupled with the bus 101 for communicating information and command
selections to the processor 102, and a mass storage device 107, such as a
magnetic disk and associated disk drive, coupled with the bus 101 for storing
information and instructions. A data storage medium 108 containing digital
information is configured to operate with mass storage device 107 to allow
processor 102 access to the digital information on data storage medium 108
via bus 101.

Processor 102 may be any of a wide variety of general purpose
processors or microprocessors such as the Pentium™ microprocessor
manufactured by Intel™ Corporation or the RS/6000™ processor
manufactured by IBM Corporation. It will be apparent to those of ordinary
skill in the art, however, that other varieties of processors may also be used
in a particular computer system. Display device 105 may be a liquid crystal
device, cathode ray tube (CRT), or other suitable display device. Mass storage
device 107 may be a conventional hard disk drive, floppy disk drive, CD-ROM
drive, or other magnetic or optical data storage device for reading and
writing information stored on a hard disk, a floppy disk, a CD-ROM, a
magnetic tape, or other magnetic or optical data storage medium. Data
storage medium 108 may be a hard disk, a floppy disk, a CD-ROM, a magnetic
tape, or other magnetic or optical data storage medium.

In general, processor 102 retrieves processing instructions and data
from a data storage medium 108 using mass storage device 107 and
downloads this information into random access memory 103 for execution.
Processor 102 then executes an instruction stream from random access
memory 103 or read-only memory 104. Command selections and
information input at input device 106 are used to direct the flow of
instructions executed by processor 102. Equivalently, input device 106 may
be a pointing device such as a conventional mouse or trackball device. The
results of this processing execution are then displayed on display device 105.


The preferred embodiment of the present invention is implemented 
as a software module, which may be executed on a computer system such as 
computer system 100 in a conventional manner. Using well known 
techniques, the application software of the preferred embodiment is stored 
on data storage medium 108 and subsequently loaded into and executed
within computer system 100. Once initiated, the software of the preferred 
embodiment operates in the manner described below. 

Figure 2 illustrates a typical prior art Web server environment. Web
client 200 can make URL requests to Web server 201 or Web server 202. Web
servers 201 and 202 include Web server executables, 201(E) and 202(E)
respectively, that perform the processing of Web client requests. Each Web
server may have a number of Web pages 201(1) - (n) and 202(1) - (n).
Depending on the URL specified by the Web client 200, the request may be
routed either by Web server executable 201(E) to Web page 201(1), for
example, or by Web server executable 202(E) to Web page 202(1). Web
client 200 can continue making URL requests to retrieve other Web pages.
Web client 200 can also use hyperlinks within each Web page to "jump" to
other Web pages or to other locations within the same Web page.


Figure 3 illustrates this prior art Web server environment in the form
of a flow diagram. In processing block 300, the Web client makes a URL
request. In processing block 302, the Web browser examines this URL request
to determine the appropriate Web server to which to route the request. In
processing block 304 the request is then transmitted from the Web browser to
the appropriate Web server, and in processing block 306 the Web server
executable examines the URL to determine whether it is an HTML document
or a CGI application. If the request is for an HTML document 308, then the
Web server executable locates the document in processing block 310. The
document is then transmitted back through the requesting Web browser for
formatting and display in processing block 312.

If the URL request is for a CGI application 314, however, the Web 
server executable locates the CGI application in processing block 316. The 
CGI application then executes and outputs HTML output in processing block 
318 and finally, the HTML output is transmitted back to the requesting Web
browser for formatting and display in processing block 320.
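The decision made by the prior art Web server executable in processing
blocks 306 through 320 can be sketched as follows. This is a simplified
illustration only: the function handle_request, the two lookup tables and the
"Not found" fallback are hypothetical and do not appear in Figure 3.

```python
def handle_request(path, documents, cgi_apps):
    """Serve a stored HTML document or run a CGI application for the URL."""
    if path in cgi_apps:
        # Blocks 316-318: locate the CGI application and collect its HTML output.
        return cgi_apps[path]()
    if path in documents:
        # Block 310: locate the static HTML document.
        return documents[path]
    # Fallback added for completeness; Figure 3 does not describe this case.
    return "<html><body>Not found</body></html>"


# Hypothetical site content used only to exercise the function.
documents = {"/index.html": "<html><body>Welcome</body></html>"}
cgi_apps = {"/cgi-bin/hits": lambda: "<html><body>Hits: 42</body></html>"}

if __name__ == "__main__":
    print(handle_request("/index.html", documents, cgi_apps))
    print(handle_request("/cgi-bin/hits", documents, cgi_apps))
```

Note that in this arrangement both branches run inside the Web server itself,
which is the limitation discussed next.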

This prior art Web server environment does not, however, provide
any mechanism for managing the Web requests or the Web sites. As Web
sites grow, and as the number of Web clients and requests increases, Web site
management becomes a crucial need.

For example, a large Web site may receive thousands of requests or
"hits" in a single day. Current Web servers process each of these requests on
a single machine, namely the Web server machine. Although these
machines may be running "multi-threaded" operating systems that allow
transactions to be processed by independent "threads," all the threads are
nevertheless on a single machine, sharing a processor. As such, the Web
executable thread may hand off a request to a processing thread, but both
threads will still have to be handled by the processor on the Web server
machine. When numerous requests are being simultaneously processed by
multiple threads on a single machine, the Web server can slow down
significantly and become highly inefficient. The claimed invention
addresses this need by utilizing a partitioned architecture to facilitate the
creation and management of custom Web sites and servers.

Figure 4 illustrates one embodiment of the presently claimed
invention. Web client 200 issues a URL request that is processed to
determine proper routing. In this embodiment, the request is routed to
Web server 201. Instead of Web server executable 201(E) processing the URL
request, however, Interceptor 400 intercepts the request and routes it to
Dispatcher 402. In one embodiment, Interceptor 400 resides on the Web
server machine as an extension to Web server 201. This embodiment is
appropriate for Web servers such as Netsite™ from Netscape, that support
such extensions. A number of public domain Web servers, such as NCSA™
from the National Center for Supercomputing Applications at the
University of Illinois, Urbana-Champaign, however, do not provide support
for this type of extension. Thus, in an alternate embodiment, Interceptor 400
is an independent module, connected via an "intermediate program" to Web
server 201. This intermediate program can be a simple CGI application
program that connects Interceptor 400 to Web server 201. Alternate
intermediate programs that perform the same functionality can also be
implemented.

In one embodiment of the invention, Dispatcher 402 resides on a
different machine than Web server 201. This embodiment overcomes the
limitation described above, in prior art Web servers, wherein all processing
is performed by the processor on a single machine. By routing the request to
Dispatcher 402 residing on a different machine than the Web server
executable 201(E), the request can then be processed by a different processor
than the Web server executable 201(E). Web server executable 201(E) is thus
free to continue servicing client requests on Web server 201 while the
request is processed "off-line," at the machine on which Dispatcher 402
resides.

Dispatcher 402 can, however, also reside on the same machine as the 
Web server. The Web site administrator has the option of configuring 
Dispatcher 402 on the same machine as Web server 201, taking into account 
a variety of factors pertinent to a particular Web site, such as the size of the 
Web site, the number of Web pages and the number of hits at the Web site. 
Although this embodiment will not enjoy the advantage described above, 
namely off-loading the processing of Web requests from the Web server 
machine, the embodiment does allow flexibility for a small Web site to grow. 
For example, the administrator of a small Web site can use a single machine for
both Dispatcher 402 and Web server 201 initially, then off-load Dispatcher 402 
onto a separate machine as the Web site grows. The Web site can thus take 
advantage of other features of the present invention regardless of whether 
the site has separate machines configured as Web servers and dispatchers. 

Dispatcher 402 receives the intercepted request and then dispatches the
request to one of a number of Page servers 404(1) - (n). For example, if Page
server 404(1) receives the dispatched request, it processes the request and
retrieves the data from an appropriate data source, such as data source 406,
data source 408, or data source 410. Data sources, as used in the present
application, include databases, spreadsheets, files and any other type of data
repository. Page server 404(1) can retrieve data from more than one data
source and incorporate the data from these multiple data sources in a single
Web page.

In one embodiment, each Page server 404(1) - (n) resides on a separate
machine on the network to distribute the processing of the request.
Dispatcher 402 maintains a variety of information regarding each Page server
on the network, and dispatches requests based on this information. For
example, Dispatcher 402 retains dynamic information regarding the data
sources that any given Page server can access. Dispatcher 402 thus examines
a particular request and determines which Page servers can service the URL
request. Dispatcher 402 then hands off the request to the appropriate Page
server.

For example, if the URL request requires financial data from data
source 408, Dispatcher 402 will first examine an information list. Dispatcher
402 may determine that Page server 404(3), for example, has access to the
requisite data in data source 408. Dispatcher 402 will thus route the URL
request to Page server 404(3). This "connection caching" functionality is
described in more detail below, under the heading "Performance."

Alternately, Dispatcher 402 also has the ability to determine whether a
particular Page server already has the necessary data cached in the Page
server's page cache (described in more detail below, under the heading
"Performance"). Dispatcher 402 may thus determine that Page server 404(1)
and 404(2) are both logged into Data source 408, but that Page server 404(2)
has the financial information already cached in Page server 404(2)'s page
cache. In this case, Dispatcher 402 will route the URL request to Page server
404(2) to more efficiently process the request.

Finally, Dispatcher 402 may determine that a number of or all Page
servers 404(1) - (n) are logged into Data source 408. In this scenario,
Dispatcher 402 can examine the number of requests that each Page server is
servicing and route the request to the least busy Page server. This "load
balancing" capability can significantly increase performance at a busy Web
site and is discussed in more detail below, under the heading "Scalability".
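The three routing considerations described above (data source access, page
cache contents, and current load) could be combined as in the following
sketch. The PageServer record, the function choose_page_server and the order
in which the three criteria are applied are assumptions made for
illustration; the description does not prescribe a particular data structure
or priority.

```python
from dataclasses import dataclass, field


@dataclass
class PageServer:
    name: str
    data_sources: set                               # sources it is logged into
    cached_pages: set = field(default_factory=set)  # URLs in its page cache
    active_requests: int = 0                        # requests being serviced


def choose_page_server(servers, url, data_source):
    """Pick a Page server for a request that needs `data_source`."""
    # Step 1: keep only servers that can reach the required data source.
    capable = [s for s in servers if data_source in s.data_sources]
    if not capable:
        raise LookupError(f"no Page server can access {data_source!r}")
    # Step 2: prefer servers that already hold the page in their page cache.
    cached = [s for s in capable if url in s.cached_pages]
    candidates = cached or capable
    # Step 3: among the remaining candidates, take the least busy one.
    return min(candidates, key=lambda s: s.active_requests)


if __name__ == "__main__":
    servers = [
        PageServer("404(1)", {"408"}, set(), 1),
        PageServer("404(2)", {"408"}, {"/financials"}, 5),
    ]
    print(choose_page_server(servers, "/financials", "408").name)
```

In this sketch a cached page outweighs a lighter load; an implementation
could equally weigh the two differently.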

If, for example, Page server 404(2) receives the request, Page server
404(2) will process the request. While Page server 404(2) is processing the
request, Web server executable 201(E) can concurrently process other Web
client requests. This partitioned architecture thus allows both Page server
404(2) and Web server executable 201(E) to simultaneously process different
requests, thus increasing the efficiency of the Web site. Page server 404(2)
dynamically generates a Web page in response to the Web client request, and
the dynamic Web page is then either transmitted back to requesting Web
client 200 or stored on a machine that is accessible to Web server 201, for later
retrieval.

One embodiment of the claimed invention also provides a Web page
designer with HTML extensions, or "dyna" tags. These dyna tags provide
customized HTML functionality to a Web page designer, to allow the
designer to build customized HTML templates that specify the source and
placement of retrieved data. For example, in one embodiment, a "dynatext"
HTML extension tag specifies a data source and a column name to allow the
HTML template to identify the data source to log into and the column name
from which to retrieve data. Alternatively, "dyna-anchor" tags allow the
designer to build hyperlink queries while "dynablock" tags provide the
designer with the ability to iterate through blocks of data. Page servers use
these HTML templates to create dynamic Web pages. Then, as described
above, these dynamic Web pages are either transmitted back to requesting
Web client 200 or stored on a machine that is accessible to Web server 201, for
later retrieval.
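As a rough illustration of how a Page server might expand a "dynatext" tag,
consider the sketch below. The tag syntax shown (a dynatext element with
source and column attributes) is purely hypothetical, since the description
names the tag and its two parameters but not its written form; the render
function and the in-memory dictionary standing in for a data source are
likewise assumptions.

```python
import re
from html import escape

# Hypothetical tag syntax; the description does not give the real one.
DYNATEXT = re.compile(r'<dynatext\s+source="([^"]+)"\s+column="([^"]+)"\s*/?>')


def render(template, data_sources):
    """Replace each dynatext tag with the named column of the named source."""
    def substitute(match):
        source, column = match.group(1), match.group(2)
        row = data_sources[source]  # stand-in for a real data source query
        return escape(str(row[column]))
    return DYNATEXT.sub(substitute, template)


template = '<p>Balance: <dynatext source="loans" column="balance"></p>'
page = render(template, {"loans": {"balance": 1250.75}})

if __name__ == "__main__":
    print(page)
```

The template stays fixed while the retrieved value changes from request to
request, which is the separation of layout and data that the dyna tags are
described as providing.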

The presently claimed invention provides numerous advantages over 
prior art Web servers, including advantages in the areas of performance, 
security, extensibility and scalability.



Performance

One embodiment of the claimed invention utilizes connection
caching and page caching to improve performance. Each Page server can be
configured to maintain a cache of connections to numerous data sources.
For example, as illustrated in Figure 4, Page server 404(1) can retrieve data
from data source 406, data source 408 or data source 410. Page server 404(1)
can maintain connection cache 412(1), containing connections to each of data
source 406, data source 408 and data source 410, thus eliminating connect
times from the Page servers to those data sources.

Additionally, another embodiment of the present invention supports
the caching of finished Web pages, to optimize the performance of the data
source being utilized. This "page caching" feature, illustrated in Figure 4 as
Page cache 414, allows the Web site administrator to optimize the
performance of data sources by caching Web pages that are repeatedly
accessed. Once the Web page is cached, subsequent requests or "hits" will
utilize the cached Web page rather than re-accessing the data source. This
can radically improve the performance of the data source.
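The two caches described above can be modeled in a few lines. The class
PageServerCaches and its connect and build_page callables are hypothetical
names used only for this sketch, and cache expiry is deliberately omitted
because the description does not address it.

```python
class PageServerCaches:
    """Toy model of a connection cache and a page cache on one Page server."""

    def __init__(self, connect, build_page):
        self._connect = connect        # callable: source name -> connection
        self._build_page = build_page  # callable: (connection, url) -> HTML
        self._connections = {}         # connection cache, keyed by data source
        self._pages = {}               # page cache, keyed by URL

    def connection(self, source):
        # Open a connection only the first time a data source is used.
        if source not in self._connections:
            self._connections[source] = self._connect(source)
        return self._connections[source]

    def page(self, url, source):
        # Serve a finished page from the cache when one exists; otherwise
        # build it once through the cached connection and remember it.
        if url not in self._pages:
            self._pages[url] = self._build_page(self.connection(source), url)
        return self._pages[url]


if __name__ == "__main__":
    caches = PageServerCaches(
        connect=lambda source: f"conn:{source}",
        build_page=lambda conn, url: f"<html>{url} via {conn}</html>",
    )
    print(caches.page("/rates", "408"))
    print(caches.page("/rates", "408"))  # second hit comes from the page cache
```

A repeated request for the same URL touches neither the data source nor the
connection logic, which is where the performance benefit arises.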



Security 

The present invention allows the Web site administrator to utilize
multiple levels of security to manage the Web site. In one embodiment, the
Page server can utilize all standard encryption and site security features
provided by the Web server. In another embodiment, the Page server can be
configured to bypass connection caches 412(1)-(n), described above, for a
particular data source and to require entry of a user-supplied identification
and password for the particular data source the user is trying to access.

Additionally, another embodiment of the presently claimed invention
requires no real-time access of data sources. The Web page caching ability,
described above, enables additional security for those sites that want to
publish non-interactive content from internal information systems, but do
not want real-time Internet accessibility to those internal information
systems. In this instance, the Page server can act as a "replication and staging
agent" and create Web pages in batches, rather than in real-time. These
"replicated" Web pages are then "staged" for access at a later time, and access
to the Web pages in this scenario is possible even if the Page server and
dispatcher are not present later.
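A batch "replication and staging" pass of the kind described above might
look like the following sketch, in which every page is generated once and
written out as an ordinary file. The function stage_pages, the file naming
rule and the build_page callable are assumptions introduced for illustration.

```python
import tempfile
from pathlib import Path


def stage_pages(urls, build_page, out_dir):
    """Generate every page once and write it out as a static HTML file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    staged = []
    for url in urls:
        # Hypothetical naming rule: "/reports/q1" becomes "reports_q1.html".
        name = url.strip("/").replace("/", "_") or "index"
        target = out_dir / f"{name}.html"
        target.write_text(build_page(url), encoding="utf-8")
        staged.append(target)
    return staged


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        pages = stage_pages(["/", "/reports/q1"], lambda u: f"<html>{u}</html>", tmp)
        for path in pages:
            print(path.name)
```

Once the files exist, a conventional Web server can serve them without any
connection to the internal information systems.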


In yet another embodiment, the Page server can make a single pass 
through a Web library, and compile a Web site that exists in the traditional 
form of separately available files. A Web library is a collection of related Web 
books and Web pages. More specifically, the Web library is a hierarchical 
organization of Web document templates, together with all the associated
data source information. Information about an entire Web site is thus 
contained in a single physical file, thus simplifying the problem of deploying 
Web sites across multiple Page servers. The process of deploying the Web 
site in this embodiment is essentially a simple copy of a single file. 


Extensibility 

One embodiment of the present invention provides the Web site
administrator with Object Linking and Embedding (OLE) 2.0 extensions to
extend the page creation process. These OLE 2.0 extensions also allow
information submitted over the Web to be processed with user-supplied
functionality. Utilizing development tools such as Visual Basic, Visual C++
or PowerBuilder that support the creation of OLE 2.0 automation, the Web
site administrator can add features and modify the behavior of the Page
servers described above. This extensibility allows one embodiment of the
claimed invention to be incorporated with existing technology to develop an
infinite number of custom Web servers.

For example, OLE 2.0 extensions allow a Web site administrator to
encapsulate existing business rules in an OLE 2.0 automation interface, to be
accessed over the Web. One example of a business rule is the steps involved
in the payoff on an installment or mortgage loan. The payoff may involve,
for example, taking into account the current balance, the date and the interest
accrued since the last payment. Most organizations already have this type of
business rule implemented using various applications, such as Visual Basic
for client-server environments, or CICS programs on mainframes. If these
applications are OLE 2.0 compliant, the Page server "dynaobject" HTML
extension tag can be used to encapsulate the application in an OLE 2.0
automation interface. The Page server is thus extensible, and can incorporate
the existing application with the new Page server functionality.
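To make the loan payoff example concrete, the sketch below computes a payoff
figure from the three inputs named above: the current balance, the date, and
the interest accrued since the last payment. The simple-interest formula with
a 365-day year is an assumption chosen for illustration; an organization's
actual business rule may differ, and the function name payoff_amount is
hypothetical.

```python
from datetime import date
from decimal import Decimal, ROUND_HALF_UP


def payoff_amount(balance, annual_rate, last_payment, payoff_date):
    """Balance plus simple interest accrued since the last payment.

    Assumes simple daily interest on a 365-day year (illustrative only).
    """
    days = (payoff_date - last_payment).days
    if days < 0:
        raise ValueError("payoff date precedes the last payment")
    interest = balance * annual_rate * Decimal(days) / Decimal(365)
    return (balance + interest).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    amount = payoff_amount(
        Decimal("10000"), Decimal("0.0730"), date(2024, 1, 1), date(2024, 1, 31)
    )
    print(amount)
```

A rule of this kind is what would be wrapped behind an automation interface
so that a Page server can invoke it while building a page.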


Scalability 

One embodiment of the claimed invention allows "plug and play"
scalability. As described above, referring to Figure 4, Dispatcher 402
maintains information about all the Page servers configured to be serviced by
Dispatcher 402. Any number of Page servers can thus be "plugged" into the
configuration illustrated in Figure 4, and the Page servers will be instantly
activated as the information is dynamically updated in Dispatcher 402. The
Web site administrator can thus manage the overhead of each Page server
and modify each Page server's load, as necessary, to improve performance.
In this manner, each Page server will cooperate with other Page servers
within a multi-server environment. Dispatcher 402 can examine the load on
each Page server and route new requests according to each Page server's
available resources. This "load-balancing" across multiple Page servers can
significantly increase a Web site's performance.
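The "plug and play" behavior can be sketched as a registry that the
dispatcher updates at run time. The Dispatcher class below and its method
names are hypothetical, and measuring load as a simple count of requests in
progress is an assumption; the description refers more generally to each
Page server's available resources.

```python
class Dispatcher:
    """Tracks registered Page servers and their current request counts."""

    def __init__(self):
        self._load = {}  # Page server name -> number of requests in progress

    def register(self, name):
        # A newly plugged-in Page server becomes eligible immediately.
        self._load.setdefault(name, 0)

    def unregister(self, name):
        self._load.pop(name, None)

    def dispatch(self):
        """Return the least busy Page server and count the new request."""
        if not self._load:
            raise RuntimeError("no Page servers registered")
        name = min(self._load, key=self._load.get)
        self._load[name] += 1
        return name

    def complete(self, name):
        # Called when a Page server finishes a request.
        if self._load.get(name, 0) > 0:
            self._load[name] -= 1


if __name__ == "__main__":
    d = Dispatcher()
    d.register("404(1)")
    print(d.dispatch(), d.dispatch())
    d.register("404(2)")  # added while the site is running
    print(d.dispatch())
```

After the second Page server is registered, new requests flow to it until the
loads even out, with no change to the Web server itself.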


Figure 5 illustrates the processing of a Web browser request in the
form of a flow diagram, according to one embodiment of the presently
claimed invention. A Web browser sends a URL request to a Web server in
processing block 500. In processing block 502, the Web server receives the
URL request, and an interceptor then intercepts the handling of the request
in processing block 504. The interceptor connects to a dispatcher and sends
the URL request to the dispatcher in processing block 506. In processing block
508, the dispatcher determines which Page servers can handle the request.
The dispatcher also determines which Page server is processing the fewest
requests in processing block 510, and in processing block 512, the dispatcher
sends the URL request to an appropriate Page server. The Page server
receives the request and produces an HTML document in processing block
514. The Page server then responds to the dispatcher with notification of the
name of the cached HTML document in processing block 516. In processing
block 518, the dispatcher responds to the interceptor with the document
name, and the interceptor then replaces the requested URL with the newly
generated HTML document in processing block 520. The Web server then
sends the new HTML document to the requesting client in processing block
522. Finally, the Web browser receives and displays the HTML document
created by the Page server at processing block 524.
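The hand-offs of Figure 5 can be traced in a single-process sketch. All
function names and the generated_documents dictionary below are hypothetical
stand-ins; in the described embodiments these components may run on separate
machines, and the dispatcher's selection logic is reduced here to taking the
first available Page server.

```python
generated_documents = {}  # stands in for storage reachable by the Web server


def page_server(url):
    """Blocks 514-516: produce an HTML document and report its name."""
    name = f"cache{len(generated_documents) + 1}.html"
    generated_documents[name] = f"<html><body>Page for {url}</body></html>"
    return name


def dispatcher(url, page_servers):
    """Blocks 508-512 and 518: select a Page server, relay the document name."""
    # Selection is trivial here; a fuller version would weigh capability and load.
    return page_servers[0](url)


def interceptor(url, page_servers):
    """Blocks 504-506 and 520: forward the request, return the document name."""
    return dispatcher(url, page_servers)


def web_server(url, page_servers):
    """Blocks 502 and 522: receive the request, send the generated document."""
    document_name = interceptor(url, page_servers)
    return generated_documents[document_name]


if __name__ == "__main__":
    print(web_server("/report", [page_server]))
```

Only the document name travels back through the dispatcher and interceptor;
the Web server then sends the stored document to the client.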

Thus, a method and apparatus for creating and managing custom Web
sites is disclosed. These specific arrangements and methods described herein
are merely illustrative of the principles of the present invention. Numerous
modifications in form and detail may be made by those of ordinary skill in
the art without departing from the scope of the present invention. Although
this invention has been shown in relation to a particular preferred
embodiment, it should not be considered so limited. Rather, the present
invention is limited only by the scope of the appended claims.